Abstract: We compare two different techniques for proving
non-Shannon-type information inequalities. The first one is the
original Zhang-Yeung method, commonly referred to as the
copy/pasting lemma/trick. The copy lemma was used to derive
the first conditional and unconditional non-Shannon-type
inequalities. The second technique first appeared in the paper of
Makarychev et al. [7] and is based on a coding lemma from the works
of Ahlswede and Körner. We first emphasize the importance of balanced
inequalities and provide a simpler proof of a theorem of Chan
for the case of Shannon-type inequalities. We compare the power
of various proof systems based on a single technique.

Index Terms: Information inequalities; non-Shannon-type;
balanced inequalities; proof techniques.

I. Introduction 

Information inequalities are linear inequalities for the Shannon
entropy of random variables. They play a central role in
information theory, for they tell us how much information can be
compressed, and are useful in many converse coding theorems.
Determining all the inequalities satisfied by the joint entropy,
and thus describing the so-called space of entropic vectors,
has become a major challenge in information theory. Apart
from evident applications in all kinds of information-theoretic
problems, more fundamental connections are known to exist
with matroid theory, Kolmogorov complexity, determinantal
inequalities, combinatorics, and group theory.

Shannon's seminal works [11], [12] of the 1940s introduced,
amid many other things, the first information inequality,
commonly called the basic inequality:

$$H(AC) + H(BC) \ge H(ABC) + H(C).$$

In the language of information theory, this means that the
conditional mutual information $I(A:B|C)$ is non-negative.
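
As a purely numerical illustration (not a proof), the basic inequality can be spot-checked on random distributions. The sketch below assumes three variables $A$, $B$, $C$ over a binary alphabet; the helper names `entropy`, `cond_mutual_info` and `random_joint` are hypothetical and introduced only for this example.

```python
from collections import defaultdict
from itertools import product
from math import log2
import random


def entropy(joint, coords):
    """Shannon entropy (in bits) of the marginal on the given coordinates."""
    marginal = defaultdict(float)
    for outcome, p in joint.items():
        marginal[tuple(outcome[i] for i in coords)] += p
    return -sum(p * log2(p) for p in marginal.values() if p > 0)


def cond_mutual_info(joint, a, b, c):
    """I(A:B|C) = H(AC) + H(BC) - H(ABC) - H(C), with a, b, c lists of coordinates."""
    return (entropy(joint, a + c) + entropy(joint, b + c)
            - entropy(joint, a + b + c) - entropy(joint, c))


def random_joint(rng, alphabet=2, n=3):
    """A random joint distribution on n variables (illustrative choice of sampler)."""
    weights = {o: rng.random() for o in product(range(alphabet), repeat=n)}
    total = sum(weights.values())
    return {o: w / total for o, w in weights.items()}


rng = random.Random(0)
values = [cond_mutual_info(random_joint(rng), [0], [1], [2]) for _ in range(200)]
min_value = min(values)  # never below zero, up to floating-point error

# A = B is a uniform bit and C is constant, so I(A:B|C) = H(A) = 1 bit.
one_bit = cond_mutual_info({(0, 0, 0): 0.5, (1, 1, 0): 0.5}, [0], [1], [2])
```

Coordinates 0, 1, 2 stand for $A$, $B$, $C$; passing longer coordinate lists gives the same quantity for groups of variables.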

Positive linear combinations of instances of the basic
inequality are called Shannon-type inequalities. The question
of whether these Shannon-type inequalities are the only valid
ones was raised by Pippenger [9] in 1986, yet only
answered more than 10 years later. The first non-Shannon-type
inequality was proven by Z. Zhang and R. W. Yeung in [17]
using the copy trick. Their technique has subsequently been
used to find infinite families of non-Shannon-type inequalities
(see IS, fSl, fT4l). A few years later, a different technique
was discovered by K. Makarychev, Y. Makarychev,
A. Romashchenko and N. Vereshchagin (see [7]), based on results on
sub-achievable entropy vectors for the entropy characterization
problem (see @ p. 352]). This new technique proved a
5-variable generalization of the original 4-variable Zhang-Yeung
inequality.

To the author's knowledge, these two techniques are the only
ones known to date for proving non-Shannon-type inequalities.
The aim of this paper is to study and compare the power
of these two techniques. The task of proving Shannon-type
inequalities is known to be an LP problem and is better left to
computer programs, e.g. ITIP [15] or Xitip [10]. Rules could
be added to these programs for the derivation of
non-Shannon-type inequalities using the two techniques we mentioned.

We show that each technique can prove the same inequalities,
modulo rewriting inequalities in some equivalent form.
Indeed, a result of Chan in [2] states that every
information inequality can be equivalently put in balanced
form. We present an elementary proof of this result for the
particular case of Shannon-type inequalities, and argue that
balanced inequalities play an important role in the comparison
of the two techniques.

After fixing notations, the rest of the paper is organized
as follows. Section II explains Chan's balanced inequalities.
The respective techniques of Zhang-Yeung and Makarychev et al.
are presented in Section III. Various proof systems involving the
two different techniques are compared in Section IV.

A. Preliminaries 

Let $\{X_i\}_{i \in \mathcal{N}}$ be a collection of random variables
indexed by a set $\mathcal{N}$ of $n$ elements. For a non-empty subset
$J \subseteq \mathcal{N}$, we denote by $X_J$ the set of random
variables $\{X_j : j \in J\}$.

An unconditional linear information inequality for a set
of $n$ random variables is a linear form with $2^n - 1$ real
coefficients $(c_J)_{\emptyset \neq J \subseteq \mathcal{N}}$ such that
for all jointly distributed random variables
$\{X_i\}_{i \in \mathcal{N}}$,

$$\sum_{\emptyset \neq J \subseteq \mathcal{N}} c_J H(X_J) \ge 0.$$



We call Shannon-type the inequalities in the set of all positive
linear combinations of instances of the basic inequality,
that is, the valid inequalities that can be put in the form

$$\sum_{J,K,L} c_{J,K,L}\, I(X_J:X_K|X_L) \ge 0, \qquad (1)$$

where all $c_{J,K,L}$ are non-negative.



II. Balanced Inequalities 

Definition 1 (Balanced Inequalities). An $n$-variable information
inequality is said to be balanced for variable $X_i$ if the sum of
the coefficients involving $X_i$ is zero:

$$\sum_{J \ni i} c_J = 0.$$

An $n$-variable information inequality is simply called balanced
if it is balanced for all of its $n$ variables.

Given a valid linear information inequality, can one obtain
a balanced counterpart that is also a valid information
inequality? This question was answered in a paper of Chan (see [2]).

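Theorem 1 (Balanced Inequalities, Chan [2]). Let
$(c_J)_{\emptyset \neq J \subseteq \mathcal{N}}$ be a list of
coefficients. The following are equivalent:

1) The inequality

$$\sum_{\emptyset \neq J \subseteq \mathcal{N}} c_J H(X_J) \ge 0$$

is a valid information inequality.

2) The inequality

$$\sum_{\emptyset \neq J \subseteq \mathcal{N}} c_J H(X_J) - \sum_{j \in \mathcal{N}} r_j H(X_j | X_{\mathcal{N} \setminus \{j\}}) \ge 0,$$

where $r_j$ is the sum of all $c_J$ involving $j$, is a valid
balanced information inequality.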

The previous result states that any information inequality
can be balanced by subtracting the corresponding terms.
Obviously, the coefficients $r_j$ must be non-negative, hence the
balanced inequality appears to be stronger.

Example 1. The 3-variable inequality

$$H(X_2, X_3) \ge 0$$

balances into the following inequality:

$$I(X_1:X_2X_3) + I(X_2:X_3|X_1) \ge 0.$$
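
As a check of Example 1, both expressions can be expanded into joint entropies. Here only $X_2$ and $X_3$ appear in the original inequality, so in the notation of Theorem 1 the coefficients are $r_1 = 0$ and $r_2 = r_3 = 1$:

$$
\begin{aligned}
H(X_2X_3) - H(X_2|X_1X_3) - H(X_3|X_1X_2)
  &= H(X_1X_2) + H(X_1X_3) + H(X_2X_3) - 2H(X_1X_2X_3), \\
I(X_1:X_2X_3) + I(X_2:X_3|X_1)
  &= [H(X_1) + H(X_2X_3) - H(X_1X_2X_3)] \\
  &\quad + [H(X_1X_2) + H(X_1X_3) - H(X_1X_2X_3) - H(X_1)] \\
  &= H(X_1X_2) + H(X_1X_3) + H(X_2X_3) - 2H(X_1X_2X_3).
\end{aligned}
$$

The two expansions coincide, and each variable appears with total coefficient $1 + 1 - 2 = 0$, so the resulting inequality is indeed balanced.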

The original proof of Theorem 1 involves a random coding
argument and Chan-Yeung's technique of approximating entropic
vectors using quasi-uniform distributions (see [3]). While
the original proof is quite involved, we present hereafter a
simpler proof for the case of Shannon-type inequalities.

A. Balancing the Basic Inequality 

For a set of $n$ random variables $X_\mathcal{N}$, an instance of the
basic inequality has the form:

$$I(X_J:X_K|X_L) \ge 0, \qquad (2)$$

for $J, K, L$ nonempty subsets of $\mathcal{N}$.

Notice first that inequality (2) is already balanced whenever
$J, K, L$ are pairwise disjoint. It is also balanced for any single
variable in the set $X_L$, since such a variable appears in each term
of the inequality (twice with coefficient $1$ and twice with
coefficient $-1$). For a variable $x$ in $X_J$, the inequality (2) is
balanced for $x$ iff $x$ does not appear in $X_K$. A symmetric remark
holds for variables in $X_K$. Therefore, the basic inequality (2) is
balanced iff $J \cap K = \emptyset$. If $W = J \cap K$ is non-empty,
inequality (2) rewrites to:

$$H(X_W|X_L) + I(X_{J \setminus W}:X_{K \setminus W}|X_{W \cup L}) \ge 0,$$

which balances into

$$I(X_W:X_{\mathcal{N} \setminus W}|X_L) + I(X_{J \setminus W}:X_{K \setminus W}|X_{W \cup L}) \ge 0.$$

This inequality is the sum of two (other) instances of the basic
inequality; it is thus a valid Shannon-type inequality. So we
have just proven the following proposition:

Proposition 1. Theorem 1 holds for instances of the basic
inequality.

Example 2. For 3-variable information inequalities, the only
balanced instances of the basic inequality are the following
ones:

$$I(X_1:X_2X_3) \ge 0, \quad I(X_2:X_1X_3) \ge 0, \quad I(X_3:X_1X_2) \ge 0,$$
$$I(X_1:X_2|X_3) \ge 0, \quad I(X_1:X_3|X_2) \ge 0, \quad I(X_2:X_3|X_1) \ge 0,$$
$$I(X_1:X_2) \ge 0, \quad I(X_1:X_3) \ge 0, \quad I(X_2:X_3) \ge 0.$$

(Note that we can recover the first line from the last two.)
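
The list in Example 2 can be cross-checked by brute force. The following sketch assumes an inequality is represented as a mapping from subsets of $\{1, 2, 3\}$ to integer coefficients; the helper names are hypothetical. It enumerates all instances $I(X_J:X_K|X_L)$, allowing $L$ to be empty and the sets to overlap, and keeps the distinct non-trivial balanced ones.

```python
from itertools import combinations


def nonempty_subsets(ground):
    items = sorted(ground)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def basic_instance(J, K, L):
    """Coefficients of I(X_J:X_K|X_L) = H(JL) + H(KL) - H(JKL) - H(L)."""
    coeffs = {}
    for subset, sign in ((J | L, 1), (K | L, 1), (J | K | L, -1), (L, -1)):
        if subset:  # the entropy of the empty set is 0
            coeffs[subset] = coeffs.get(subset, 0) + sign
    return {s: c for s, c in coeffs.items() if c != 0}


def is_balanced(coeffs, ground):
    """True if, for every variable, the coefficients involving it sum to zero."""
    return all(sum(c for s, c in coeffs.items() if i in s) == 0 for i in ground)


ground = frozenset({1, 2, 3})
subsets = nonempty_subsets(ground)
balanced = set()
for J in subsets:
    for K in subsets:
        for L in [frozenset()] + subsets:
            coeffs = basic_instance(J, K, L)
            if coeffs and is_balanced(coeffs, ground):  # skip the trivial 0 >= 0
                balanced.add(frozenset(coeffs.items()))

print(len(balanced))  # 9 distinct inequalities, as listed in Example 2
```

Instances that differ only by swapping $J$ and $K$, or by repeating variables of $L$ inside $J$ or $K$, expand to the same coefficients and are therefore counted once.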

B. Balancing Shannon-type Inequalities 

By definition, a Shannon-type inequality (of the form (1)) is
simply a (weighted) sum of instances of the basic inequality.
Since the balanced property is stable under sums, balancing a
Shannon-type inequality is the same as balancing each of the
instances of the basic inequality in (1). By Proposition 1, the
balanced inequality thus obtained is valid:

Corollary 1. Theorem 1 holds for Shannon-type information
inequalities.

Notice that the argument of Subsection II-A shows that the
balanced inequality we obtain is Shannon-type.

C. Balancing General Information Inequalities 

For a general (non-Shannon-type) information inequality,
we should still rely on the original proof of Theorem 1, though
a more direct proof is not excluded. Note, however, that most
of, if not all, the known non-Shannon-type inequalities are
already balanced.

Remark 1. Checking whether a given inequality is balanced, and
balancing it, both have linear complexity in the length of the
input inequality (written as a sum of joint entropies).
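
A minimal sketch of both operations, assuming an inequality is stored as a mapping from variable subsets to coefficients (a representation chosen here for illustration, with hypothetical function names). One pass over the terms yields every sum $r_i$, and balancing then adds two terms per variable, following the formula of Theorem 1.

```python
def variable_sums(coeffs):
    """r_i = sum of the coefficients c_J over all J containing i, in one pass."""
    sums = {}
    for subset, c in coeffs.items():
        for i in subset:
            sums[i] = sums.get(i, 0) + c
    return sums


def is_balanced(coeffs):
    return all(r == 0 for r in variable_sums(coeffs).values())


def balance(coeffs, ground):
    """Subtract r_i * H(X_i | all other variables) for every variable i."""
    full = frozenset(ground)
    result = dict(coeffs)
    for i, r in variable_sums(coeffs).items():
        if r == 0:
            continue
        # H(X_i | others) = H(X_N) - H(X_N without i)
        for subset, delta in ((full, -r), (full - {i}, r)):
            if subset:  # the entropy of the empty set is 0
                result[subset] = result.get(subset, 0) + delta
    return {s: c for s, c in result.items() if c != 0}


# Example 1: H(X2, X3) >= 0 over the variables {1, 2, 3}.
example = {frozenset({2, 3}): 1}
balanced_example = balance(example, {1, 2, 3})
```

On Example 1 the result is $H(X_1X_2) + H(X_1X_3) + H(X_2X_3) - 2H(X_1X_2X_3)$, which is the entropy expansion of $I(X_1:X_2X_3) + I(X_2:X_3|X_1)$.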

III. Techniques for Non-Shannon-Type Inequalities

We describe the two main techniques for proving
non-Shannon-type information inequalities.



A. Zhang-Yeung's Technique 

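Rule ZY

(A) If we have an information inequality of the form:

$$f(X_\mathcal{N}, Y_\mathcal{M}) + g(Y_\mathcal{M}, Z) + a\, I(Z:X_\mathcal{N}|Y_\mathcal{M}) \ge 0,$$

for some $a \ge 0$;

(B) then the following (stronger) inequality is also valid:

$$f(X_\mathcal{N}, Y_\mathcal{M}) + g(Y_\mathcal{M}, Z) \ge 0.$$

The correctness of this rule is based on the following lemma.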

Lemma 1 (Copy lemma, 0). Let $A, B, C$ be three jointly
distributed random variables. There exists a fourth random
variable $A'$ such that:

• $(A, B)$ and $(A', B)$ have the same distribution;

• $A'$ is independent of $(A, C)$ given $B$.

Such an $A'$ is called a $C$-copy of $A$ over $B$.

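Proof of Correctness of Rule ZY: Take $Z'$ to be an
$X_\mathcal{N}$-copy of $Z$ over $Y_\mathcal{M}$, and apply the
inequality of step (A) with $Z = Z'$. By Lemma 1,
$I(Z':X_\mathcal{N}|Y_\mathcal{M}) = 0$ and $(Z', Y_\mathcal{M})$ has
the same distribution as $(Z, Y_\mathcal{M})$, so that
$g(Y_\mathcal{M}, Z') = g(Y_\mathcal{M}, Z)$. We thus obtain the
inequality of step (B).

■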
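
The copy in Lemma 1 can be built explicitly for finite distributions by setting $p(a, b, c, a') = p(a, b, c)\, p_{A|B}(a'|b)$. The sketch below is an illustration under that standard construction, with hypothetical helper names and an arbitrary example distribution; it checks both properties of the lemma numerically.

```python
from collections import defaultdict
from math import log2


def marginal(joint, coords):
    """Marginal distribution on the given coordinates, keyed by tuples."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in coords)] += p
    return dict(out)


def entropy(joint, coords):
    return -sum(p * log2(p) for p in marginal(joint, coords).values() if p > 0)


def c_copy(joint):
    """From p(a, b, c), build p(a, b, c, a2) = p(a, b, c) * p(A = a2 | B = b)."""
    p_ab = marginal(joint, (0, 1))
    p_b = marginal(joint, (1,))
    extended = {}
    for (a, b, c), p in joint.items():
        for (a2, b2), q in p_ab.items():
            if b2 == b and p > 0 and q > 0:
                extended[(a, b, c, a2)] = p * q / p_b[(b,)]
    return extended


# Arbitrary example distribution on (A, B, C).
joint = {(0, 0, 0): 0.4, (0, 1, 1): 0.1, (1, 1, 0): 0.2, (1, 0, 1): 0.3}
ext = c_copy(joint)  # coordinates: 0 = A, 1 = B, 2 = C, 3 = A'

# (A', B) has the same distribution as (A, B).
law_ab = marginal(joint, (0, 1))
law_copy = marginal(ext, (3, 1))
same_law = all(abs(law_copy.get(k, 0.0) - v) < 1e-12 for k, v in law_ab.items())

# I(A' : AC | B) = H(A'B) + H(ACB) - H(A'ACB) - H(B) vanishes.
cmi = (entropy(ext, (3, 1)) + entropy(ext, (0, 2, 1))
       - entropy(ext, (0, 1, 2, 3)) - entropy(ext, (1,)))
```

In the proof above, $Z$, $Y_\mathcal{M}$ and $X_\mathcal{N}$ play the roles of $A$, $B$ and $C$ respectively.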

This technique has been extensively used to obtain constrained
and unconstrained non-Shannon-type inequalities (e.g.
R, ["61, 10, im, HH, US). As an example, we show how
to obtain the very first non-Shannon-type inequality using this
rule.

Theorem 2 (Zhang and Yeung, [17]). The following is a
4-variable information inequality:

$$I(C:D) \le I(C:D|A) + I(C:D|B) + I(A:B) + I(C:D|A) + I(A:C|D) + I(A:D|C).$$

Proof: Apply Rule ZY to the following Shannon-type
information inequality (which can be verified using a computer
program):

$$I(C:D) \le I(C:D|A) + I(C:D|B) + I(A:B) + I(C:D|Z) + I(Z:C|D) + I(Z:D|C) + 3\,I(Z:AB|CD).$$

Let $Z = A$ in the inequality we obtain. ■

B. Makarychev et al.'s Technique

Rule MMRV

(A) If we have an information inequality of the form:

$$f(X_\mathcal{N}, Y_\mathcal{M}) + g(Y_\mathcal{M}, Z) \ge 0;$$

(B) then the following (stronger) inequality is also valid:

$$f(X_\mathcal{N}, Y_\mathcal{M}) + g(Y_\mathcal{M}, Z) - r_Z H(Z|Y_\mathcal{M}) \ge 0,$$

where $r_Z$ is the sum of coefficients of $g$ involving $Z$.

The correctness of this rule is based on a result from the works
of Ahlswede, Gács and Körner. The general result is presented in
a book by Csiszár and Körner [4]. The relevance of this result
was also underlined by Wyner in fVi\. We state here a special
case suited to our needs.

Lemma 2 (Ahlswede-Körner Lemma, [1], H). Let
$y_1, \ldots, y_n, z$ be $n + 1$ jointly distributed random variables.
Consider their respective $M$ i.i.d. copies $Y_1, \ldots, Y_n, Z$. Then
there exists a random variable $Z'$ such that:

• $H(Z'|Y_1, \ldots, Y_n) = 0$,

• $H(Y_J|Z') - M \cdot H(y_J|z) = o(M)$, for all $\emptyset \neq J \subseteq \mathcal{N}$.

Denote this $Z'$ by $AK(Z:Y_1, \ldots, Y_n)$.

Proof of Correctness of Rule MMRV: Consider the
joint $M$ i.i.d. copies $X_\mathcal{N}^M, Y_\mathcal{M}^M, Z^M$ of the
variables $X_\mathcal{N}, Y_\mathcal{M}, Z$.
Let $Z' = AK(Z^M:Y_\mathcal{M}^M)$ be the variable obtained using
Lemma 2. Apply the inequality of step (A) to the corresponding
$M$ independent copies, except take $Z'$ in place of $Z^M$. Entropy
terms not involving $Z'$ are thus $M$ times greater. Let us compute the
entropy terms involving $Z'$ (from $g$) using Lemma 2:

$$
\begin{aligned}
H(Z') &= I(Z':Y_\mathcal{M}^M) + H(Z'|Y_\mathcal{M}^M) \\
      &= H(Y_\mathcal{M}^M) - H(Y_\mathcal{M}^M|Z') + 0 \\
      &= M \cdot H(Y_\mathcal{M}) - M \cdot H(Y_\mathcal{M}|Z) + o(M) \\
      &= M \cdot [H(Z) - H(Z|Y_\mathcal{M})] + o(M).
\end{aligned}
$$

Let $J \subseteq \mathcal{M}$,

$$
\begin{aligned}
H(Z', Y_J^M) &= H(Z') + H(Y_J^M|Z') \\
             &= M \cdot [H(Z) - H(Z|Y_\mathcal{M}) + H(Y_J|Z)] + o(M) \\
             &= M \cdot [H(Z, Y_J) - H(Z|Y_\mathcal{M})] + o(M).
\end{aligned}
$$

Rewriting our instance of inequality (A) thus gives

$$M \cdot [f(X_\mathcal{N}, Y_\mathcal{M}) + g(Y_\mathcal{M}, Z) - r_Z H(Z|Y_\mathcal{M})] + o(M) \ge 0,$$

where $r_Z$ is the sum of coefficients of $g$ involving $Z$. Dividing
the last inequality by $M$ and making $M$ tend to infinity gives
the inequality of step (B). ■

As an example, we retrieve the proof by Makarychev et al. of the
generalization of the 4-variable inequality of Zhang and Yeung (see
Theorem 2).

Theorem 3 (Makarychev et al., [7]). The following is a
5-variable information inequality:

$$I(C:D) \le I(C:D|A) + I(C:D|B) + I(A:B) + I(C:D|E) + I(E:C|D) + I(E:D|C).$$

Proof: Apply Rule MMRV to the Shannon-type inequality:

$$H(Z) \le I(C:D|A) + I(C:D|B) + I(A:B) + 2H(Z|C) + 2H(Z|D).$$

Let $Z = E$ in the inequality we obtain. ■
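
The step in this proof can be replayed symbolically. The sketch below assumes inequalities are stored as integer coefficients on joint entropies, with variables written as single letters; all helper names are hypothetical. It applies Rule MMRV with $X_\mathcal{N} = \{A, B\}$ and $Y_\mathcal{M} = \{C, D\}$, and compares the result with the inequality of Theorem 3 written with $Z$ in place of $E$.

```python
def add(*terms):
    """Sum of coefficient dictionaries, dropping zero and empty-set terms."""
    total = {}
    for coeffs in terms:
        for subset, c in coeffs.items():
            if subset:  # the entropy of the empty set is 0
                total[subset] = total.get(subset, 0) + c
    return {s: c for s, c in total.items() if c != 0}


def scale(k, coeffs):
    return {s: k * c for s, c in coeffs.items()}


def H(xs, given=""):
    """H(xs | given) = H(xs, given) - H(given)."""
    return add({frozenset(xs + given): 1}, {frozenset(given): -1})


def I(xs, ys, given=""):
    """I(xs : ys | given) = H(xs|given) + H(ys|given) - H(xs, ys|given)."""
    return add(H(xs, given), H(ys, given), scale(-1, H(xs + ys, given)))


def rule_mmrv(coeffs, z, ys):
    """Subtract r_Z * H(Z | Y), with r_Z the sum of coefficients involving Z."""
    r_z = sum(c for s, c in coeffs.items() if z in s)
    return add(coeffs, scale(-r_z, H(z, ys))), r_z


f = add(I("C", "D", "A"), I("C", "D", "B"), I("A", "B"))

# Premise, written as (right-hand side) - (left-hand side) >= 0.
premise = add(f, scale(2, H("Z", "C")), scale(2, H("Z", "D")), scale(-1, H("Z")))
conclusion, r_z = rule_mmrv(premise, "Z", "CD")

# Theorem 3 with Z in place of E, in the same normal form.
theorem3 = add(f, I("C", "D", "Z"), I("Z", "C", "D"), I("Z", "D", "C"),
               scale(-1, I("C", "D")))
```

Here $r_Z = 2 + 2 - 1 = 3$, and the output of the rule has the same coefficients as Theorem 3. The sketch only checks the bookkeeping of the rule; the validity of the premise is a separate Shannon-type verification.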

Since balancing will appear to be important in the sequel,
we state simple properties of the two rules.

Proposition 2.

• Suppose inequality (B) is inferred from (A) by Rule ZY
and $V$ is a variable. Then (A) is balanced for $V$ iff (B) is
balanced for $V$.

• Suppose inequality (B) is inferred from (A) by
Rule MMRV and $V \neq Z$ is a variable. Then:

  - (A) is balanced for $V$ iff (B) is balanced for $V$;

  - (B) is balanced for $Z$.

The proof follows immediately from the statements of the
rules and the definition of balanced inequalities. Notice that
Rule MMRV is only useful when applied to inequalities that
are not balanced for $Z$; the inequality it produces is balanced
for $Z$.

IV. Comparison of Proof Systems

In the spirit of information inequality provers, we will 
consider and compare various proof systems based on the two 
rules described above. 

Definition 2. A proof system (for inequalities) consists of a
pool $P$ of inequalities and a rule $T$. A (computation) step in
a proof system is described as follows:

1) Pick an inequality (A) from the convex closure of $P$;

2) Apply rule $T$ to (A) and infer inequality (B);

3) Add (B) to the pool $P$.

A derivation is a sequence of valid steps in a system. An
inequality (I) is provable in system $S$ if it belongs to the
convex closure of the pool of $S$ after a derivation.

Note that in the previous rules, the naming of the variables
is unimportant. The special variable $Z$ may change for each
application of a rule. We want to compare the following
systems:

• System ZY: the system using Rule ZY.

• System ZY+B: the system using Rule ZY and balancing
at each step.

• System R: the system using Rule MMRV.

• System R+B: the system using Rule MMRV and
balancing at each step.

Usually, a proof system will be initialized with a starting pool
of inequalities: the (elemental) Shannon-type inequalities.

First, we show that the two inference rules of Section III
are in a sense equivalent if we keep in mind Theorem 1 about
balanced inequalities.

Theorem 4 (Equivalence modulo balancing).

Suppose $(B_1)$ can be inferred from $(A_1)$ by Rule ZY,
where $(A_1)$ is balanced for $Z$. Then there is an $(A_2)$ such
that:

• $(B_1)$ can be inferred from $(A_2)$ by Rule MMRV;

• $(A_2)$ follows from $(A_1)$.

Suppose $(B'_1)$ can be inferred from $(A'_1)$ by Rule MMRV.
Then there is an $(A'_2)$ such that:

• $(B'_1)$ can be inferred from $(A'_2)$ by Rule ZY;

• $(A'_2)$ balances for $Z$ into $(A'_1)$.



Proof:

Rule ZY $\Rightarrow$ Rule MMRV: Suppose

$$f(X_\mathcal{N}, Y_\mathcal{M}) + g(Y_\mathcal{M}, Z) + a\, I(Z:X_\mathcal{N}|Y_\mathcal{M}) \ge 0, \qquad (A_1)$$

for some $a \ge 0$, is a valid information inequality. By
Rule ZY, the stronger

$$f(X_\mathcal{N}, Y_\mathcal{M}) + g(Y_\mathcal{M}, Z) \ge 0 \qquad (B_1)$$

is also valid. Let us show that inequality $(B_1)$ can also be
obtained using Rule MMRV and balancing. Start from the
inequality

$$f(X_\mathcal{N}, Y_\mathcal{M}) + g'(Y_\mathcal{M}, Z) \ge 0 \qquad (A_2)$$

defined using $g' = g + a H(Z|Y_\mathcal{M})$. This inequality is valid:
since $a$ is non-negative, it follows from $(A_1)$ and

$$H(Z|Y_\mathcal{M}) \ge I(Z:X_\mathcal{N}|Y_\mathcal{M}).$$

By applying Rule MMRV we get

$$f(X_\mathcal{N}, Y_\mathcal{M}) + g'(Y_\mathcal{M}, Z) - r'_Z H(Z|Y_\mathcal{M}) \ge 0, \qquad (B_2)$$

where $r'_Z$ is the sum of coefficients of $g'$ involving $Z$. By
definition of $g'$ we have $r'_Z = a + r_Z$, where $r_Z$ is the sum of
coefficients of $g$ involving $Z$. Thus inequality $(B_2)$ rewrites to

$$f(X_\mathcal{N}, Y_\mathcal{M}) + g(Y_\mathcal{M}, Z) - r_Z H(Z|Y_\mathcal{M}) \ge 0,$$

and since $(A_1)$ is balanced for $Z$, i.e., $r_Z = 0$, the previous
inequality is exactly inequality $(B_1)$.



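Rule MMRV $\Rightarrow$ Rule ZY: Suppose

$$f(X_\mathcal{N}, Y_\mathcal{M}) + g(Y_\mathcal{M}, Z) \ge 0 \qquad (A'_2)$$

is a valid information inequality. By Rule MMRV, the
stronger

$$f(X_\mathcal{N}, Y_\mathcal{M}) + g(Y_\mathcal{M}, Z) - r_Z H(Z|Y_\mathcal{M}) \ge 0 \qquad (B'_2)$$

is also valid. Let us show that inequality $(B'_2)$ can also be
inferred using Rule ZY and balancing.
Notice first that

$$H(Z|Y_\mathcal{M}) = H(Z|X_\mathcal{N}Y_\mathcal{M}) + I(Z:X_\mathcal{N}|Y_\mathcal{M}),$$

therefore $(A'_2)$ rewrites to

$$f(X_\mathcal{N}, Y_\mathcal{M}) + [g(Y_\mathcal{M}, Z) - r_Z H(Z|Y_\mathcal{M})] + r_Z H(Z|X_\mathcal{N}Y_\mathcal{M}) + r_Z I(Z:X_\mathcal{N}|Y_\mathcal{M}) \ge 0,$$

where $r_Z$ is the sum of the coefficients of $g$ involving $Z$.
Balancing this inequality for $Z$ gives:

$$f(X_\mathcal{N}, Y_\mathcal{M}) + [g(Y_\mathcal{M}, Z) - r_Z H(Z|Y_\mathcal{M})] + r_Z I(Z:X_\mathcal{N}|Y_\mathcal{M}) \ge 0. \qquad (A'_1)$$

Applying the inference rule of Rule ZY to $(A'_1)$ gives

$$f(X_\mathcal{N}, Y_\mathcal{M}) + g(Y_\mathcal{M}, Z) - r_Z H(Z|Y_\mathcal{M}) \ge 0, \qquad (B'_1)$$

which is exactly inequality $(B'_2)$. ■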



This result shows the importance of balancing
non-Shannon-type inequalities. For a Shannon-type inequality, its
balanced counterpart is also Shannon-type and thus already belongs
to the pool. However, the balanced counterpart of a
non-Shannon-type inequality may not belong to the pool.

Corollary 2. Let (I) be an information inequality. The
following are equivalent:

• (I) is provable in System R+B.

• (I) is provable in System ZY+B.

• (I) is provable in System R when using only inequalities
balanced for all variables but $Z$.

• (I) is provable in System ZY when using only inequalities
balanced for $Z$.

Proof: Follows immediately from Proposition 2 and
Theorem 4. ■

V. Conclusion 

We have shown that it does not matter which of the two
rules is implemented in an information inequality prover, as long
as it is applied to balanced inequalities. Since the cost of
checking and balancing an inequality is minor, and balanced
inequalities are stronger than their counterparts, they should
be useful for such programs. Moreover, we have seen that
balancing is not compulsory at each step because the rules
can only improve balancing (Proposition 2). A last argument
in favour of balanced inequalities might be the fact that they
are the only inequalities valid for continuous entropy (see E
Theorem 2]).